Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-33364][SQL] Introduce the "purge" option in TableCatalog.dropTable for v2 catalog. #30267

Closed
wants to merge 3 commits into from

Conversation

imback82
Copy link
Contributor

@imback82 imback82 commented Nov 5, 2020

What changes were proposed in this pull request?

This PR proposes to introduce the purge option in TableCatalog.dropTable so that v2 catalogs can use the option if needed.

Related discussion: #30079 (comment)

Why are the changes needed?

Spark DDL supports passing the purge option to DROP TABLE command. However, the option is not used (ignored) for v2 catalogs.

Does this PR introduce any user-facing change?

This PR introduces a new API in TableCatalog.

How was this patch tested?

Added a test.

@imback82
Copy link
Contributor Author

imback82 commented Nov 5, 2020

cc @rdblue @cloud-fan

@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35275/

@SparkQA
Copy link

SparkQA commented Nov 5, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35275/

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @imback82 .

@SparkQA
Copy link

SparkQA commented Nov 6, 2020

Test build #130664 has finished for PR 30267 at commit 18669b7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class DropTableExec(

* <p>
* If the purge option is set to false, the default implementation falls back to
* {@link #dropTable(Identifier)} dropTable}. Otherwise, it throws
* {@link UnsupportedOperationException}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should mention that, people need to overwrite this method to support the purge option.

* If the catalog supports views and contains a view for the identifier and not a table, this
* must not drop the view and must return false.
* <p>
* If the catalog supports the option to purge a table, this method must be overwritten.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overwritten -> overridden.

@SparkQA
Copy link

SparkQA commented Nov 6, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35300/

@dongjoon-hyun
Copy link
Member

Merged to master. The last two commits are only for comment changes.

@imback82
Copy link
Contributor Author

imback82 commented Nov 6, 2020

Thanks @dongjoon-hyun / @cloud-fan for the review!

@SparkQA
Copy link

SparkQA commented Nov 6, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35300/

@SparkQA
Copy link

SparkQA commented Nov 6, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35302/

@SparkQA
Copy link

SparkQA commented Nov 6, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35302/

@SparkQA
Copy link

SparkQA commented Nov 6, 2020

Test build #130692 has finished for PR 30267 at commit b36af8d.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 6, 2020

Test build #130689 has finished for PR 30267 at commit c63560f.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rdblue
Copy link
Contributor

rdblue commented Nov 7, 2020

Late +1 from me. Thanks @imback82!

HyukjinKwon pushed a commit that referenced this pull request Dec 23, 2020
### What changes were proposed in this pull request?

This is a followup of #30267

Inspired by #30886, it's better to have 2 methods `def dropTable` and `def purgeTable`, than `def dropTable(ident)` and `def dropTable(ident, purge)`.

### Why are the changes needed?

1. make the APIs orthogonal. Previously, `def dropTable(ident, purge)` calls `def dropTable(ident)` and is a superset.
2. simplifies the catalog implementation a little bit. Now the `if (purge) ... else ...` check is done at the Spark side.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

existing tests

Closes #30890 from cloud-fan/purgeTable.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon pushed a commit that referenced this pull request Dec 23, 2020
This is a followup of #30267

Inspired by #30886, it's better to have 2 methods `def dropTable` and `def purgeTable`, than `def dropTable(ident)` and `def dropTable(ident, purge)`.

1. make the APIs orthogonal. Previously, `def dropTable(ident, purge)` calls `def dropTable(ident)` and is a superset.
2. simplifies the catalog implementation a little bit. Now the `if (purge) ... else ...` check is done at the Spark side.

No.

existing tests

Closes #30890 from cloud-fan/purgeTable.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit ec1560a)
Signed-off-by: HyukjinKwon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants